vSphere to Google Drive VM Backup Pipeline

Sequential, Resumable, and Zero-Local-Storage-Overflow Automation
Target Environment: Intel-based MacBook Pro (x86_64)

Overview

This setup backs up large virtual machines from your vSphere environment (10.11.11.13) to a Google Drive account without filling your MacBook Pro's local disk. It finds powered-off VMs, exports them one at a time as an OVF/OVA package, uploads each export to Google Drive, and deletes the local copy before moving on to the next machine, so at most one export occupies local disk at any time.
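The local-disk guarantee only holds if the scratch volume has room for the largest single export. A minimal sketch of a pre-flight check follows. It assumes you know or can estimate a VM's size in bytes; the helper name has_room_for and the 1.2 headroom factor are illustrative and are not part of backup.py.

```python
import shutil


def has_room_for(export_dir: str, vm_size_bytes: int, safety_factor: float = 1.2) -> bool:
    """Return True if the filesystem holding export_dir can take one export plus headroom."""
    # shutil.disk_usage reports total, used and free bytes for the containing filesystem
    free_bytes = shutil.disk_usage(export_dir).free
    return free_bytes >= int(vm_size_bytes * safety_factor)
```

For example, has_room_for("/app/export", 50 * 1024**3) would report whether a VM of roughly 50 GiB fits before the export starts.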

Step 1: Install Local Host Dependencies

On your MacBook Pro, open Terminal and install Homebrew, rclone, and Docker Desktop by running the following commands in order. Launch Docker Desktop once afterwards so that the Docker engine is running:

Terminal Commands
# Install Homebrew (if not already installed)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install Rclone locally to handle the initial Google Drive authorization
brew install rclone

# Install Docker Desktop via Homebrew Cask
brew install --cask docker

Intel Architecture Note

Because your Mac has an Intel processor, Docker builds and runs containers natively as linux/amd64. The Dockerfile downloads the x86_64 Linux build of govc, so it runs without an emulation layer.
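If you are unsure which processor a machine has, a quick check along these lines may help. It is a sketch only: the helper name is_x86_64 is illustrative, and it relies on Python's platform.machine() reporting "x86_64" on Intel Macs (Apple silicon reports "arm64").

```python
import platform


def is_x86_64(machine: str) -> bool:
    """Return True for the names commonly used for 64-bit Intel/AMD processors."""
    return machine.lower() in {"x86_64", "amd64"}


def host_matches_dockerfile() -> bool:
    """Check the current host against the x86_64 govc build the Dockerfile downloads."""
    return is_x86_64(platform.machine())
```

On a host where host_matches_dockerfile() returns False, the image as written would need emulation or a different govc build.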

Step 2: Configure Google Drive Remote Access

Because Google Drive uses OAuth, you must complete a one-time authorization in a browser on your Mac to link your storage account:

  1. Run rclone config in your terminal.
  2. Type n to create a new remote, and name it exactly: gdrive.
  3. When prompted for the storage type, select Google Drive (enter drive, or the number shown next to it in the list).
  4. Leave client ID and client secret blank (press Enter).
  5. Choose full access (option 1).
  6. When asked to use auto-config, select Yes (y). This opens your default browser so you can authorize your Google account.
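Once the steps above are done, you can confirm that the remote exists under the exact name the script expects. The sketch below parses the output of rclone listremotes, which prints one remote per line followed by a colon. The function names are illustrative and not part of backup.py.

```python
import subprocess


def remote_is_configured(listremotes_output: str, name: str = "gdrive") -> bool:
    """Return True if `name` appears in the text printed by `rclone listremotes`."""
    remotes = {
        line.strip().rstrip(":")
        for line in listremotes_output.splitlines()
        if line.strip()
    }
    return name in remotes


def check_rclone_remote(name: str = "gdrive") -> bool:
    """Run rclone on the host and check for the remote (requires rclone on PATH)."""
    result = subprocess.run(
        ["rclone", "listremotes"], capture_output=True, text=True, check=True
    )
    return remote_is_configured(result.stdout, name)
```

Calling check_rclone_remote() on the Mac should return True before you move on to the next step.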

Step 3: Create Project Configuration Files

Create a dedicated folder on your Mac (e.g., inside your home directory) and place the following two files inside it.

1. Dockerfile

This image bundles Python, the x86_64 Linux build of the govc CLI, and rclone for the Google Drive upload.

Dockerfile
FROM python:3.11-slim

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    unzip \
    && rm -rf /var/lib/apt/lists/*

# Install native Intel/AMD64 govc (vSphere CLI)
RUN curl -L -o - "https://github.com/vmware/govmomi/releases/latest/download/govc_Linux_x86_64.tar.gz" | tar -C /usr/local/bin -xvzf - govc

# Install rclone (Google Drive Engine)
RUN curl https://rclone.org/install.sh | bash

# Set up working directory structure
WORKDIR /app
COPY backup.py /app/backup.py

RUN mkdir -p /app/data /app/export

ENTRYPOINT ["python3", "/app/backup.py"]

2. Automation Script (backup.py)

This Python script tracks each VM's progress in a state file so that an interrupted run can resume where it left off.

backup.py
import os
import json
import subprocess
import sys

STATE_FILE = "/app/data/state.json"
EXPORT_DIR = "/app/export"

def run_command(command, shell=False):
    try:
        result = subprocess.run(command, shell=shell, check=True, capture_output=True, text=True)
        return result.stdout.strip()
    except subprocess.CalledProcessError as e:
        print(f"Execution Error: {e.cmd}\nStderr: {e.stderr}")
        return None

def load_state():
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE, 'r') as f:
            return json.load(f)
    return {}

def save_state(state):
    with open(STATE_FILE, 'w') as f:
        json.dump(state, f, indent=4)

def get_powered_off_vms():
    print("Querying vSphere infrastructure for powered-off inventory...")
    output = run_command(["govc", "find", ".", "-type", "m", "-runtime.powerState", "poweredOff"])
    if not output:
        return []
    return [line for line in output.splitlines() if line]

def main():
    required_envs = ["GOVC_URL", "GOVC_USERNAME", "GOVC_PASSWORD"]
    for env in required_envs:
        if env not in os.environ:
            print(f"Execution failed. Missing environment configuration variable: {env}")
            sys.exit(1)

    state = load_state()
    vms = get_powered_off_vms()

    if not vms:
        print("No powered-off target virtual machines detected.")
        return

    print(f"Discovered {len(vms)} powered-off VMs queued for dynamic backup.")

    for vm_path in vms:
        vm_name = vm_path.split('/')[-1]
        
        if state.get(vm_name) == "COMPLETED":
            print(f"[{vm_name}] Backup verification exists on Google Drive. Skipping.")
            continue

        print(f"\n--- Activating Stream For: {vm_name} ---")
        ova_path = os.path.join(EXPORT_DIR, f"{vm_name}/{vm_name}.ova")
        
        if state.get(vm_name) != "DOWNLOADED":
            print(f"[{vm_name}] Executing safe OVF/OVA export stream...")
            os.makedirs(os.path.dirname(ova_path), exist_ok=True)
            
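            # govc's export command is "export.ovf" (OVF descriptor plus disk files);
            # -f overwrites partial files left behind by an interrupted run
            export_res = run_command(["govc", "export.ovf", "-f", "-vm", vm_path, os.path.dirname(ova_path)])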
            if export_res is None:
                print(f"[{vm_name}] Export failed. Will retry on the next run.")
                continue
            
            state[vm_name] = "DOWNLOADED"
            save_state(state)
            print(f"[{vm_name}] Local export written successfully to scratch drive partition.")

        print(f"[{vm_name}] Offloading stream to Google Drive and pruning local workspace blocks...")
        
        # 'rclone move' deletes each local file only after its transfer has succeeded and been verified
        upload_cmd = [
            "rclone", "move", 
            os.path.dirname(ova_path), 
            f"gdrive:vSphere_Backups/{vm_name}",
            "--drive-chunk-size", "64M", 
            "--retries", "5",
            "--low-level-retries", "10"
        ]
        
        upload_res = run_command(upload_cmd)
        if upload_res is None:
            print(f"[{vm_name}] Cloud pipe upload interrupted. Retaining scratch workspace asset for resumption.")
            continue

        state[vm_name] = "COMPLETED"
        save_state(state)
        print(f"[{vm_name}] Operations finalized. Local data purged, Google cloud object sealed.")

    print("\nGlobal infrastructure backup pipeline task completed.")

if __name__ == "__main__":
    main()
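Resumption depends on state.json staying readable even if the container is killed in the middle of a write. One possible hardening, shown here as a sketch rather than as part of the script above, is to write the state to a temporary file in the same directory and then swap it into place with os.replace. The name save_state_atomic is illustrative.

```python
import json
import os
import tempfile


def save_state_atomic(state: dict, state_file: str) -> None:
    """Write state to a temp file beside state_file, then rename it into place."""
    directory = os.path.dirname(state_file) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f, indent=4)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, state_file)  # replaces the old file in one step
    except BaseException:
        # Clean up the temp file if anything failed before the rename
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this approach a reader of state.json sees either the previous contents or the new contents, never a half-written file.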

Step 4: Build & Deploy Container Engine

In a terminal, change into the directory that contains the two files, build the image, and run it. Replace YourActualPassword with your vSphere password, and adjust the username if yours differs.

Build & Execution Commands
# Step A: Build the container image natively for your Intel processor
docker build -t vsphere-backup .

# Step B: Create directories on your Desktop for the state file and the temporary exports
mkdir -p ~/Desktop/backup_data
mkdir -p ~/Desktop/backup_temp

# Step C: Run the orchestrated container pipeline
docker run -d \
  --name vm_backup_runner \
  -e GOVC_URL="10.11.11.13" \
  -e GOVC_USERNAME="Administrator@VSPHERE.LOCAL" \
  -e GOVC_PASSWORD="YourActualPassword" \
  -e GOVC_INSECURE="true" \
  -v ~/.config/rclone/rclone.conf:/root/.config/rclone/rclone.conf \
  -v ~/Desktop/backup_data:/app/data \
  -v ~/Desktop/backup_temp:/app/export \
  vsphere-backup
  vsphere-backup
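Typing the password into the command leaves it in your shell history. As an optional alternative, the sketch below launches the same container from Python and prompts for the password instead. It relies on Docker's behaviour of copying a variable from the caller's environment when -e is given a name without a value. The function names are illustrative, and the password remains visible through docker inspect on the running container.

```python
import getpass
import os
import subprocess


def build_docker_args(url: str, username: str, home: str) -> list:
    """Assemble the same docker run invocation as Step C, minus the password value."""
    return [
        "docker", "run", "-d", "--name", "vm_backup_runner",
        "-e", f"GOVC_URL={url}",
        "-e", f"GOVC_USERNAME={username}",
        "-e", "GOVC_PASSWORD",  # no value: Docker copies it from the caller's environment
        "-e", "GOVC_INSECURE=true",
        "-v", f"{home}/.config/rclone/rclone.conf:/root/.config/rclone/rclone.conf",
        "-v", f"{home}/Desktop/backup_data:/app/data",
        "-v", f"{home}/Desktop/backup_temp:/app/export",
        "vsphere-backup",
    ]


def launch() -> None:
    """Prompt for the vSphere password and start the container."""
    env = dict(os.environ, GOVC_PASSWORD=getpass.getpass("vSphere password: "))
    args = build_docker_args(
        "10.11.11.13", "Administrator@VSPHERE.LOCAL", os.path.expanduser("~")
    )
    subprocess.run(args, env=env, check=True)
```

Calling launch() from a Python prompt on the Mac starts the container with the same mounts and settings as the shell command.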

Monitoring and Network Resumption

To follow the pipeline's progress in real time, stream the container's logs:

Monitoring Commands
# View live operational log output stream
docker logs -f vm_backup_runner

If your MacBook drops Wi-Fi or you change locations, remove the stopped container and start it again. The script reads ~/Desktop/backup_data/state.json, skips every VM already marked COMPLETED, uploads any export marked DOWNLOADED, and re-exports anything that was interrupted:

Resumption Template
# Remove the interrupted container
docker rm -f vm_backup_runner

# Re-run the same Step C docker run command to resume
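
Before restarting, it can be useful to see how far the previous run got. The sketch below summarizes the state file on the host. It assumes the two status values the script writes (DOWNLOADED and COMPLETED); the function names are illustrative.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_state(state: dict) -> dict:
    """Count VMs per status as recorded by backup.py."""
    counts = Counter(state.values())
    return {
        "completed": counts.get("COMPLETED", 0),
        "awaiting_upload": counts.get("DOWNLOADED", 0),
    }


def read_state(path: str = "~/Desktop/backup_data/state.json") -> dict:
    """Load the state file from the host, or return an empty dict if it is missing."""
    state_path = Path(path).expanduser()
    if not state_path.exists():
        return {}
    return json.loads(state_path.read_text())
```

summarize_state(read_state()) gives the number of finished VMs and the number of exports still waiting to be uploaded. VMs that never finished exporting have no entry in the file and are exported again on the next run.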